Convolutional Pitch Target Approximation Model for Speech Synthesis
Authors
Abstract
In this paper, we investigate pitch contour modelling in speech synthesis based on segmental units and propose a convolutional pitch target approximation model. The model allows joint stochastic modelling of the framewise pitch and of the pitch contour of longer units; the relation between the two is made explicit by a convolutional target approximation filter. The pitch contour is stylized by a linear representation called the pitch target. In the synthesis stage, the likelihoods of the framewise model and the pitch target model are jointly maximized using a Toeplitz matrix that represents the discrete convolutional filter.
Index Terms: pitch modelling, speech synthesis, pitch target approximation.
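The joint maximization described in the abstract can be illustrated with a minimal sketch. It assumes that both models are Gaussian and that the framewise contour is the pitch target passed through a causal linear filter; neither assumption is confirmed by the abstract. The first-order low-pass impulse response, the means, the precisions, and the function names `convolution_toeplitz` and `joint_ml_target` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def convolution_toeplitz(h, n):
    """Build the n x n lower-triangular Toeplitz matrix H with H[i, j] = h[i - j].

    H @ t then equals the causal convolution of h and t, truncated to n samples.
    """
    h = np.asarray(h, dtype=float)
    idx = np.arange(n)[:, None] - np.arange(n)[None, :]
    valid = (idx >= 0) & (idx < len(h))
    H = np.zeros((n, n))
    H[valid] = h[idx[valid]]
    return H


def joint_ml_target(H, mu_frame, prec_frame, mu_target, prec_target):
    """Maximize log N(H t; mu_frame, prec_frame^-1) + log N(t; mu_target, prec_target^-1).

    Assumption: both models are Gaussian, so the maximizer solves a linear system.
    Returns the target contour t and the filtered framewise contour H @ t.
    """
    A = H.T @ prec_frame @ H + prec_target
    b = H.T @ prec_frame @ mu_frame + prec_target @ mu_target
    t = np.linalg.solve(A, b)
    return t, H @ t


# Hypothetical toy setup: 40 frames, first-order low-pass filter as the
# "target approximation" filter, and a linear pitch target in Hz.
n = 40
a = 0.8
h = (1.0 - a) * a ** np.arange(n)
H = convolution_toeplitz(h, n)

mu_target = np.linspace(100.0, 140.0, n)   # mean of the pitch target model
mu_frame = H @ mu_target + 2.0             # mean of the framewise model
prec_frame = np.eye(n) / 4.0               # inverse covariance, framewise model
prec_target = np.eye(n) / 25.0             # inverse covariance, target model

t_hat, f0_hat = joint_ml_target(H, mu_frame, prec_frame, mu_target, prec_target)
```

Writing the filter as a Toeplitz matrix turns the convolution into a matrix product, so under the Gaussian assumption the joint optimum has a closed form and no iterative search is needed.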
Similar resources
Personalizing a speech synthesizer by voice adaptation
A voice adaptation system enables users to quickly create new voices for a text-to-speech system, allowing for the personalization of the synthesis output. The system adapts to the pitch and spectrum of the target speaker, using a probabilistic, locally linear conversion function based on a Gaussian Mixture Model. Numerical and perceptual evaluations reveal insights into the correlation between...
Modeling Pitch Contour of Chinese Mandarin Sentences with the PENTA Model
In continuous speech, the pitch contour of the same syllable may vary much due to its contextual information. The Parallel Encoding and Target Approximation (PENTA) model is applied here to Mandarin speech synthesis with a method to predict pitch contours for Chinese syllables with different contexts by combining the Classification And Regression Tree (CART) with the PENTA model to improve its ...
Modeling Pitch Contour of Chinese Mandarin Sentence with PENTA Model
In continuous speech, it is believed that the pitch contour of the same syllable may vary a lot due to its different context information. To apply the Parallel Encoding and Target Approximation (PENTA) model to Mandarin speech synthesis and improve its prediction accuracy, this paper proposed a method to predict pitch contours for Chinese syllables with different contexts by combining the Class...
Modeling Speech Melody as Communicative Functions with PENTAtrainer2
This paper presents PENTAtrainer2, a semi-automatic software package written as Praat plug-in integrated with Java programs, and its applications for analysis and synthesis of speech melody as communicative functions. Its core concepts are based on the Parallel Encoding and Target Approximation (PENTA) framework, the quantitative Target Approximation (qTA) model, and the simulated annealing opt...
Pitch target analysis of Thai tones using quantitative target approximation model and unsupervised clustering
This paper presents the integration of the quantitative target approximation (qTA) model with unsupervised clustering to study Thai tones. The qTA model simulates F0 production on the basis of the articulation process. Parameters extracted from the F0 of Thai speech by an analysis-and-synthesis method were further analyzed by K-means clustering. The number and form of pitch target wer...
Publication date: 2013